Early Improving Recurrent Elastic Highway Network

Authors

  • Hyunsin Park
  • Chang Dong Yoo
Abstract

To model time-varying nonlinear temporal dynamics in sequential data, a recurrent network capable of varying and adjusting the recurrence depth between input intervals is examined. The recurrence depth is extended by several intermediate hidden state units, and the weight parameters involved in determining these units are dynamically calculated. The motivation behind the paper lies in overcoming two deficiencies of Recurrent Highway Networks (RHNs), which are currently at the forefront of RNN performance: 1) Determining the appropriate recurrence depth of an RHN for different tasks is a heavy burden, and simply setting it to a large number is computationally wasteful, with possible repercussions in the form of performance degradation and high latency. Expanding on the idea of adaptive computation time (ACT), and using an elastic gate in the form of a rectified exponentially decreasing function that takes the previous hidden state and the input as arguments, the proposed model is able to evaluate the appropriate recurrence depth for each input. The rectified gating function causes the most significant intermediate hidden state updates to come early, so that significant performance gain is achieved early. 2) Updating the weights from those of the previous intermediate layer offers a richer representation than sharing weights across all intermediate recurrence layers. The weight update procedure is an extension of the idea underlying hypernetworks. To substantiate the effectiveness of the proposed network, we conducted three experiments: regression on synthetic data, human activity recognition, and language modeling on the Penn Treebank dataset. The proposed network showed better performance than other state-of-the-art recurrent networks in all three experiments.
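To make the elastic-gate idea concrete, here is a minimal NumPy sketch of one input-interval update. The abstract only states that the gate is a rectified, exponentially decreasing function of the previous hidden state and the input; the specific functional form, the plain tanh cell, and all names below (elastic_highway_step, W_x, W_h, b, tau, eps) are illustrative assumptions, not the paper's actual equations.

```python
import numpy as np

def elastic_highway_step(x, h_prev, W_h, W_x, b, max_depth=5, tau=1.0, eps=0.1):
    """One input-interval update with an elastic recurrence depth.

    Sketch only: the gate is a rectified, exponentially decreasing function of
    the previous hidden state and the input, and intermediate updates stop once
    the gate reaches zero. The form ReLU(exp(-d/tau) * score - eps) is an
    assumption; a real model would also use separate parameters for the gate
    and the candidate state.
    """
    h = h_prev
    for d in range(max_depth):
        # Data-dependent score from the input and current hidden state (assumed form).
        score = 1.0 / (1.0 + np.exp(-(x @ W_x + h @ W_h + b).mean()))  # sigmoid
        # Rectified, exponentially decreasing gate: largest at early depths.
        gate = max(np.exp(-d / tau) * score - eps, 0.0)
        if gate == 0.0:
            break  # elastic stop: no further intermediate updates for this input
        # Candidate intermediate hidden state (plain tanh cell for illustration).
        h_tilde = np.tanh(x @ W_x + h @ W_h + b)
        # Gated interpolation, so the most significant updates come early.
        h = gate * h_tilde + (1.0 - gate) * h
    return h

# Example: one step with random parameters (input size 4, hidden size 8).
rng = np.random.default_rng(0)
x, h0 = rng.normal(size=4), np.zeros(8)
W_x, W_h, b = rng.normal(size=(4, 8)) * 0.1, rng.normal(size=(8, 8)) * 0.1, np.zeros(8)
print(elastic_highway_step(x, h0, W_h, W_x, b).shape)  # (8,)
```

Because the gate decays with depth d, the first intermediate updates carry the largest weight, matching the abstract's claim that the most significant updates come early.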


Related papers

A simple RNN-plus-highway network for statistical parametric speech synthesis

In this report, we propose a neural network structure that combines a recurrent neural network (RNN) and a deep highway network. Compared with the highway RNN structures proposed in other studies, the one proposed in this study is simpler, since it only concatenates a highway network after a pre-trained RNN. The main idea is to use the 'iterative unrolled estimation' of a highway network to fin...
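For reference, the highway layer being stacked on the RNN output has the standard highway-network form. The sketch below (with illustrative parameter names) shows a single such layer in NumPy; it is not the exact configuration used in the report above.

```python
import numpy as np

def highway_layer(h, W_t, b_t, W_h, b_h):
    """One standard highway layer applied to an RNN output vector h:
    y = t * tanh(W_h h + b_h) + (1 - t) * h, with transform gate
    t = sigmoid(W_t h + b_t). Parameter names are illustrative."""
    t = 1.0 / (1.0 + np.exp(-(h @ W_t + b_t)))   # transform gate in (0, 1)
    g = np.tanh(h @ W_h + b_h)                   # nonlinear transform
    return t * g + (1.0 - t) * h                 # carry the rest through unchanged
```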


Learning text representation using recurrent convolutional neural network with highway layers

Recently, the rapid development of word embedding and neural networks has brought new inspiration to various NLP and IR tasks. In this paper, we describe a staged hybrid model combining Recurrent Convolutional Neural Networks (RCNN) with highway layers. The highway network module, incorporated in the middle, takes the output of the bidirectional Recurrent Neural Network (Bi-RNN) module in the ...


A Differentiated Pricing Framework for Improving the Performance of the Elastic Traffics in Data Networks

Rate allocation has become a demanding task in data networks as the diversity of users and traffic proliferates. The most commonly used algorithm in end hosts is TCP. Since this is a loss-based scheme, it exhibits oscillatory behavior, which reduces network performance. Moreover, since the price for all sessions is based on the aggregate throughput, losses that are caused by TCP affect other sessions...


Autoregressive Attention for Parallel Sequence Modeling

We introduce an autoregressive attention mechanism for parallelizable character-level sequence modeling. We use this method to augment a neural model consisting of blocks of causal convolutional layers connected by highway network skip connections. We denote the models with and without the proposed attention mechanism respectively as Highway Causal Convolution (Causal Conv) and Autoregressive-at...


LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition

Popularized by the long short-term memory (LSTM), multiplicative gates have become a standard means to design artificial neural networks with intentionally organized information flow. Notable examples of such architectures include gated recurrent units (GRU) and highway networks. In this work, we first focus on the evaluation of each of the classical gated architectures for language modeling fo...



Journal:
  • CoRR

Volume abs/1708.04116  Issue -

Pages -

Publication date 2017